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ABSTRACT 

We  consider  here  edge  detection  as  a  problem  of  identify¬ 
ing  points  in  an  image  where  the  image  brightness  changes 
sharply  or  has  discontinuities.  Specifically,  we  investigate 
several  well-known  algorithms  for  edge  detection  and  mod¬ 
ify  them  for  performance  directly  on  the  bitstream  of  a  JPEG 
compressed  image.  By  performing  the  processing  in  the  com¬ 
pressed  domain,  we  dramatically  reduce  its  complexity.  The 
low  complexity  combined  with  the  fact  that  it  does  not  re¬ 
quire  JPEG  decoding  makes  this  approach  well  suited  for  dis¬ 
tributed  wireless  sensor  networks. 

To  extract  information  from  JPEG  compressed  images, 
we  can  use  the  number  of  bits  allocated  to  each  of  the  8x8 
subblocks.  Using  this  preponderance  number,  we  are  able 
to  extract  information  regarding  the  frequency  content  of  the 
image.  In  this  paper  we  will  present  a  comparison  of  first- 
order,  differential,  and  phase  congruency  approaches  to  edge 
detection  and  determine  how  these  algorithms  perform  when 
applied  directly  on  the  bitstream  domain.  The  performance  of 
the  edge  detectors  will  be  determined  by  using  the  Eigure  of 
Merit  (EOM)  and  Closest  Distance  Metric  (CDM). 

Index  Terms —  JPEG-based  edge  detection,  sensor  net¬ 
work  coding,  edge  detection,  bitstream  processing 

1.  INTRODUCTION 

Edge  detection  is  a  vital  part  of  image  processing,  which  is 
used  for  extracting  important  features  from  an  image.  In  many 
cases,  we  are  only  concerned  with  the  shape  of  an  object, 
or  recognizing  some  part  of  it.  Edge  detection  is  especially 
important  for  applications  in  computer  vision.  Eor  our  pur¬ 
poses,  the  main  idea  is  to  find  areas  where  discontinuities  ex¬ 
ist.  These  discontinuities  generally  include  step  edges,  line 
edges,  and  junctions  [1]. 

Because  of  the  importance  of  edge  detection,  researchers 
have  already  come  up  with  many  methods  for  finding  the 
edges  in  an  image.  These  methods  include  Sobel,  Canny, 
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Laplacian  of  Gaussian,  and  others.  Moreover,  nearly  all  meth¬ 
ods  of  edge  detection  use  a  smoothing  step.  This  step  is  per¬ 
formed  in  order  to  help  minimize  false  edges  due  to  noise, 
and  other  small  details.  Another  step  that  is  commonly  seen 
in  edge  detection  is  edge  labeling  which  involves  touching  up 
the  image,  suppressing  false  edges,  and  connecting  edges  [1] . 
We  will  be  focusing  on  the  actual  process  of  detecting  edges. 

Our  interest  lies  in  applying  edge  detection  methods  to  the 
bitstream  domain.  After  JPEG  encoding  the  number  of  bits 
used  to  encode  each  8x8  subblock  gives  us  an  idea  of  the 
attributes  of  an  image.  This  can  have  several  advantages  in 
image  processing,  because  edge  detection  can  be  performed 
on  a  reduced  image  data  set  without  completely  decoding  the 
bitstream.  This  can  further  be  applied  to  distributed  video 
coding  applications  very  easily. 

The  proposed  algorithm  has  been  developed  while  explor¬ 
ing  alternative  methods  to  steps  in  [2] .  In  particular,  we  pro¬ 
pose  a  method  of  edge  detection  to  aid  in  identifying  clusters 
of  pixels  that  have  similar  frequency  structure  as  shown  in 
[2,  3].  These  clusters  can  be  used  to  identify  areas  of  high 
correlation  between  multiple  images.  Eor  example,  these  im¬ 
ages  can  be  a  product  of  either  a  video  sequence  or  spatially 
separated  cameras  with  an  overlapping  field  of  view.  In  this 
application,  emphasis  is  placed  on  reducing  computational 
complexity  as  well  as  producing  accurate  edge  maps  in  or¬ 
der  to  identify  patterns  in  the  two  different  images.  Other 
applications  may  include  extracting  information  from  large 
collections  of  images,  such  as  databases  or  the  Internet. 

2.  BIT-DOMAIN  EDGE  DETECTION 

The  details  of  the  JPEG  compression  algorithm  are  fully  de¬ 
scribed  in  [4],  but  what  is  important  in  our  application  is  that  it 
compresses  images  using  8x8  blocks  of  pixels  in  a  largely  in¬ 
dependent  manner.  This  allows  the  information  in  each  block 
to  be  represented  by  the  number  of  bits  necessary  to  encode 
it,  known  as  ’’the  preponderance”.  Consequently,  the  pattern 
created  by  the  bit  counts  of  the  8x8  pixel  blocks  that  form 
regions  of  an  image  provide  us  with  information  about  the 
spatial  composition  of  those  regions  which  can,  in  theory,  be 
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used  for  image  processing.  We  propose  here  to  use  the  pat¬ 
terns  formed  by  these  bit  allocations  instead  of  the  actual  pix¬ 
els  to  perform  edge  detection.  A  method  for  extracting  the 
preponderance  values  from  the  bitstream  is  detailed  in  [2].  In 
addition  to  the  spatial  information,  since  the  JPEG  encoding 
step  uses  an  entropy  encoder,  the  preponderance  values  pro¬ 
vide  us  with  course  estimates  of  the  entropy  of  each  8x8 
block  of  pixels.  An  example  of  the  preponderance  values  are 
shown  in  Figure  1 . 


Fig.  1.  Preponderance  image  of  Lena 


The  proposed  algorithm  is  a  four  step  process  and  is  illus¬ 
trated  in  Figure  2.  First,  an  averaging  filter  is  applied  to  the 
Preponderance  Block  (PB)  image,  the  2-D  array  containing 
the  preponderance  numbers.  This  step  is  helpful  in  reducing 
false  alarms  in  edge  detection  by  removing  some  of  the  less 
significant  edges.  We  have  found  a  filter  mask  of  3  x  3  tends  to 
give  us  an  acceptable  amount  of  false  alarm  reduction  without 
removing  too  much  of  the  useful  information. 

Following  the  averaging  filter,  we  determine  the  gradient 
of  the  PB  image.  To  do  this,  we  filter  the  PB  image  once  hor¬ 
izontally  and  once  vertically  using  a  Sobel  edge-emphasizing 
filter  [5].  Each  filtering  will  result  in  a  component  gradient 
in  the  filtered  direction,  in  the  x  direction  and  ly  in  the  y 
direction.  Equation  (1)  gives  the  magnitude  of  the  gradient 
by  calculating  the  Euclidean  distance  of  the  sum  of  the  two 
component  gradients.  The  resulting  gradient  is  high  around 
edges  and  tends  to  be  low  elsewhere. 

1=^/2+  /2  (1) 

Lastly,  we  use  fc-means  clustering  to  divide  the  gradient 
magnitude  into  a  binary  map  of  edges  and  non-edges. 


Fig.  2.  Block  diagram  of  the  proposed  algorithm 

3.  COMPUTATIONAL  COMPLEXITY 

By  performing  the  segmentation  process  directly  on  the  bit- 
stream,  we  are  able  to  reduce  processing  requirements  for  our 
algorithm.  By  not  requiring  the  decoding  of  the  JPEG  image, 
we  eliminate  the  need  to  decode  the  Huffman  run-length  code 
and  compute  the  inverse  Discrete  Cosine  Transform  (DCT). 
Determining  the  complexity  for  decoding  the  Huffman  code 
is  a  challenging  problem  since  the  number  of  table  lookups 
can  vary  depending  on  the  quantization  and  image  type.  Here, 
we  have  computed  the  average  number  of  table  lookups  re¬ 
quired  to  decode  each  image  in  the  Berkeley  database  and 
determined  that  each  8x8  block  requires  about  1 1  lookups. 
The  2-D  DCT  can  be  implemented  using  the  2-D  EFT  and  as 
a  result  an  TV  x  W  image  with  D  x  D  sized  2-D  DCT  blocks, 
where  N  >  D  has  a  complexity  of  0((ZJlog2  *  (5)^) 
multiplications.  For  an  512  x  512  JPEG  encoded  image,  we 
have  an  initial  savings  of  about  45,000  lookups  and  about 
1.13  X  10®  multiplications  that  is  achieved  just  by  avoiding 
the  decoding  process. 

Since  we  are  only  processing  the  number  of  bits  used  to 
encode  each  8x8  block,  we  reduce  the  complexity  of  each 
step  in  our  segmentation  process  by  8  in  each  dimension.  For 
filtering,  this  allows  us  to  reduce  our  complexity  by  a  factor 
of64;l. 

4.  RESULTS 

Here,  we  present  quantitative  edge  detection  results  for  the 
proposed  algorithm  using  a  variety  of  metrics  and  techniques. 
To  quantitatively  measure  performance,  we  will  use  the  met- 


rics  discussed  next. 

4.1.  Metrics 

In  order  to  compare  our  results  against  conventional  methods, 
we  will  be  relying  on  two  objective  metrics.  The  first  Method 
will  be  Pratts  Figure  of  Merit  (FoM)  [6].  We  will  also  use 
the  Closest  Distance  Metric  (CDM)  described  by  Prieto  and 
Allen  [7]. 

The  first  metric  we  used  was  the  Figure  of  Merit  as  de¬ 
scribed  by  [6].  By  making  use  of  a  ground  truth  image,  this 
metric  compares  distances  of  the  measured  edges  to  the  actual 
edges  in  the  ground  truth  images.  The  formula  for  FoM  is  as 
follows: 


max{Pm,  Pa)  1  +  a  X  (P(i)  ^  ^ 

where  Pm  is  the  number  of  edge  pixels  in  the  calculated 
edge  image.  Pa  is  the  number  of  pixels  in  the  actual  edge 
image,  a  is  a  constant  usually  set  to  |,  and  d{i)  is  the  distance 
between  calculated  and  actual  pixels.  In  this  paper  we  have 
scaled  the  FoM  to  range  from  [0,100]  to  more  easily  compare 
the  FoM  to  the  CDM. 

The  other  metric  we  used  was  the  Closest  Distance  Metric 
as  described  by  [7].  This  metric  differs  from  FoM  in  that 
it  looks  for  matches  between  the  calculated  and  actual  edge 
images  within  a  certain  distance  ry.  It  also  takes  into  account 
the  possibility  that  no  match  is  found.  Below  is  the  formula 
for  CDM: 

CDAf,K<.)  =  100(1 (3) 

|mUa| 

q{CM)  =  l-<r((z,j),(fc,[))x(l-|m(i,j)-a(fc,/)|)  (4) 

^{{hj),{k,l))  =  E{max{\k  -  i\,\l  -  j\))  (5) 

where  {k,  1))  is  the  normalized  checkerboard  dis¬ 

tance  between  the  closest  matching  pixel  in  the  calculated  and 
ground  truth  images.  The  values  m{i,j)  and  a{k,  1)  are  the 
pixel  values  (0  or  1)  of  the  calculated  and  actual  edge  images 
respectively,  and  m  U  a  is  the  total  number  of  edge  pixels  in 
both  images. 

For  initial  performance  evaluation  we  will  compare  the 
performance  of  the  proposed  algorithm  and  the  well-known 
Sobel,  Canny,  and  Laplacian  of  Gaussian  methods.  The  pro¬ 
posed  algorithm  will  be  implemented  using  a  3  x  3  averaging 
filter.  In  this  first  experiment  we  will  apply  all  techniques  di¬ 
rectly  on  the  preponderance  information  to  develop  a  baseline 
for  performance.  The  respective  edge  detection  schemes  are 
applied  to  a  series  to  50  Brodatz  texture  mosaics  [8]  shown  in 


Table  1.  Average  performance  on  Brodatz  textures 


EoM 

CDM 

Sobel 

43.50 

79.23 

Canny 

73.77 

82.25 

Laplacian  of  Gaussian 

75.77 

82.93 

Bitstream  Based 

83.02 

85.55 

Metrics  range  [0,  100],  higher  is  better. 

Bold  denotes  highest  value  for  a  given  metric. 


Figure  3.  The  purpose  of  this  experiment  is  to  measure  how 
each  individual  algorithm  will  perform  given  the  same  infor¬ 
mation,  a  lossy  interpretation  (preponderance  information)  of 
images  with  diagonal  edges. 


(a)  (b)  (c)  (d) 


Fig.  3.  Example  results  from  Experiment  1 :  a)  Original  Image 
b)  Ground  Truth  c)  Proposed  Method  d)  Canny  Method. 

Eigure  3  shows  a  sample  of  the  results  from  the  first  exper¬ 
iment.  By  inspection,  the  Proposed  method  outperforms  the 
Canny  method.  Table  1  shows  the  quantitative  results  from 
the  experiment.  Using  both  the  Eigure  of  Merit  and  the  Clos¬ 
est  Distance  Metrics,  the  proposed  bitstream  based  algorithm 
outperforms  the  most  commonly  used  edge  detection  filters. 
On  the  average,  our  technique  can  improve  upon  the  compu¬ 
tationally  complex  Canny  edge  detector  by  over  12%  using 
EoM  and  over  4%  using  CDM.  While  these  metrics  were  not 
designed  for  use  in  the  preponderance  domain,  this  experi¬ 
ment  is  useful  in  demonstrating  what  is  possible  to  do  given 
such  a  heavily  reduced  data  set. 

The  second  experiment  follows  the  procedure  of  the  first, 
but  instead  uses  natural  images.  This  experiment  uses  the  four 
images  in  Eigure  4  and  the  ground  truth  provided  by  [9].  All 
four  images  test  different  aspects  of  the  edge  detection  prob¬ 
lem,  and  all  four  edge  detection  schemes  are  applied  directly 
on  the  preponderance  domain  information  to  determine  how 
each  performs  given  the  same  amount  of  information. 

The  results  in  Table  2  demonstrate  the  performance  of 
the  four  metrics  when  applied  directly  on  the  preponderance 
information.  In  all  cases,  the  proposed  bitstream  edge  de¬ 
tection  algorithm  performs  better  than  the  other  three  meth¬ 
ods.  At  first  glance,  it  appears  that  the  three  commonly  used 
methods,  Sobel,  Canny,  and  Laplacian,  perform  rather  poorly. 
However,  it  is  important  to  note  that  those  three  edge  detec¬ 
tion  schemes  were  not  designed  with  preponderance  numbers 
in  mind,  unlike  the  proposed  bitstream  processing  technique. 
To  better  understand  how  our  edge  detection  system  performs 


Fig.  4.  Test  images  for  Experiments  2  and  3  from  [9]. 


when  compared  to  the  other  edge  detection  schemes  when 
applied  on  the  image  directly  we  will  have  to  conduct  a  third 
experiment. 

This  last  experiment  demonstrates  the  performance  of 
the  proposed  algorithm  versus  well-known  edge  detectors 
on  natural  images.  In  this  case,  performance  of  the  Sobel, 
Canny,  and  Laplacian  of  Gaussian  methods  will  be  demon¬ 
strated  on  the  image  domain  while  the  proposed  algorithm 
will  still  function  on  the  preponderance  information.  Since 
the  size  of  the  edge  map  from  the  bitstream  based  method 
is  reduced  in  scale  by  1  /8  and  thus,  has  a  lower  resolution 
edge  map,  a  scaled  down  version  of  the  ground  truth  is  used 
for  calculating  the  results.  To  scale  down  the  ground  truth 
the  image  is  resized  using  cubic  interpolation  and  followed 
by  thresholding.  This  may  result  in  a  small  advantage  for  the 
preponderance  domain  processing  since  the  preponderance 
domain  tends  to  remove  noise  and  small  details.  However,  the 
results  from  the  previous  experiments  show  that  the  proposed 
bitstream  based  method  beats  the  other  algorithms  when  pro¬ 
cessing  directly  on  the  bitstream.  It  is  important  to  note  that 
bit-domain  processing  produces  lower  resolution  edge  maps; 
however,  for  our  application,  smaller  data  sets  are  desired  for 
reduced  computational  complexity.  It  is  possible  to  correct 
the  problem  of  lower  resolution  edge  maps  by  following  the 
proposed  algorithm  with  an  interpolator  to  scale  up  the  edge 
maps. 

The  results  shown  in  Table  3  demonstrate  very  good  per¬ 
formance  by  the  proposed  algorithm  even  when  compared  to 
other  non-bitstream  based  approaches  operating  in  the  pixel 
domain.  In  most  cases,  the  bitstream  based  method  was  su¬ 
perior  to  the  pixel-domain  methods  while  having  much  lower 


Table  2.  Performance  in  the  preponderance  domain. 


LoM  CDM 

Blocks 

Sobel 

Canny 

Laplacian  of  Gaussian 
Bitstream  Based 

7.53  63.78 

39.81  86.49 

30.54  85.85 

66.52  88.31 

Road 

Sobel 

Canny 

Laplacian  of  Gaussian 
Bitstream  Based 

12.02  68.95 

42.29  83.17 

36.19  83.69 

71.88  84.82 

Saturn 

Sobel 

Canny 

Laplacian  of  Gaussian 
Bitstream  Based 

24.27  79.88 

43.18  88.58 

40.87  86.13 

91.30  91.98 

Wall 

Sobel 

Canny 

Laplacian  of  Gaussian 
Bitstream  Based 

9.49  67.61 

40.86  83.78 

32.55  86.12 

87.65  88.61 

Metrics  range  [0,  100],  higher  is  better. 

Bold  denotes  highest  value  for  a  given  metric. 


Table  3.  Performance  in  the  respective  domains. 


LoM  CDM 

Blocks 

Sobel 

Canny 

Laplacian  of  Gaussian 
Bitstream  Based 

47.40  80.75 

60.61  78.86 

68.37  79.87 

66.52  88.31 

Road 

Sobel 

Canny 

Laplacian  of  Gaussian 
Bitstream  Based 

50.23  80.12 

41.82  70.77 

57.79  77.05 

71.88  84.82 

Saturn 

Sobel 

Canny 

Laplacian  of  Gaussian 
Bitstream  Based 

72.40  91.91 

57.37  79.49 

70.53  85.94 

91.30  91.98 

Wall 

Sobel 

Canny 

Laplacian  of  Gaussian 
Bitstream  Based 

63.80  77.47 

30.90  63.87 

37.00  65.86 

87.65  88.61 

Metrics  range  [0,  100],  higher  is  better. 

Bold  denotes  highest  value  for  a  given  metric. 


complexity. 

5.  CONCLUSION  AND  FUTURE  WORK 

In  this  paper,  we  have  introduced  a  novel  approach  to  edge  de¬ 
tection  using  bitstream  based  processing.  We  have  shown  that 
by  doing  processing  directly  on  the  bitstream  we  are  able  to 
reduce  computational  complexity  for  use  in  applications  such 
as  distributed  sensor  networks  and  JPEG  database  searching. 
We  have  compared  the  proposed  segmentation  techniques  to 
other  commonly  used  techniques  and  the  bitstream-based  pro¬ 
cedure  is  usually  able  to  achieve  performance  that  is  as  good 
as  or  better  than  that  obtained  from  other  techniques.  Its  com¬ 
petitive  performance,  combined  with  its  reduced  complexity 
certainly  recommend  it  as  an  alternative  method  for  complex¬ 
ity  sensitive  applications.  By  operating  on  a  reduced  data 
set  and  avoiding  the  decoding  process,  the  complexity  sav¬ 
ings  has  been  shown  to  be  significant.  This  type  of  edge  de¬ 
tection  technique  lends  itself  to  multimedia  search  applica¬ 
tions  which  seek  to  extract  information  from  large  databases 
of  JPEG-compressed  images. 

We  have  only  considered  here  JPEG  compressed  rep¬ 
resentations  of  images.  However,  the  use  of  other  lossy 
compression  algorithms  which  produce  relatively  informative 
compressed  representations  can  also  be  explored.  Since  the 
bit  allocations  give  us  a  rough  approximation  of  an  image, 
we  can  also  investigate  the  performance  of  other  previously 
developed  edge  detection  algorithms  on  the  bitstream. 
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